This page is meant to serve as a tutorial to using R for network analysis. There are already many excellent tutorials available online, such as those created by Katya Ognyanova, Jesse Sadler, Janpu Hou, among others. Other resources that are helpful as you are learning network analysis in R include: the Statnet Project Workshops wiki, Francios Briatte’s Awesome Network Analysis archive, and the R graph gallery, Keith McNulty’s Handbook of Graphs and Networks in People Analytics, and Wickham’s and Grolemund’s R for Data Science reference book.

Some R Basics for Network Analysis

The following section includes some basic tasks that you are likely to use while doing network analysis in R. The main idea here is to start to become familiar with the R language. For the purposes of network analysis, it is helpful to understand how R makes use of vectors, factors, matrices, lists, and data frames. Please note that this file is meant to focus on R code; see the lectures and slides to learn about the concepts being discussed.

Vectors

The c() function allows you to create vectors in R.

v1 <- c(77, 3, 14, 102, 66) # this is a numeric vector with a length of 5
v2 <- c(1, 1, 1, 1, 1) # another vector with length 5
length(v1) # verify length of the vector v1
## [1] 5
v3 <- c("network", "analysis", "is", "super", "fun") # this is a character vector with a length of 5
v4 <- c("FALSE", "TRUE") # this is a logical vector
class(v1) # verify numeric vector
## [1] "numeric"
class(v3) # verify character vector
## [1] "character"
class(v4) # verify character vector
## [1] "character"

You can combine vectors, but some restrictions will be imposed if you combine different types. For example, if I combine v1 (numeric) with v2 (string/character), then R automatically treats this as a character vector. Numerics can be characters, but characters cannot be numerics. There are a variety of other operations you can do with vectors:

v4 <- c(v1,v3) # create new vector that combines multiple
class(v4) # verify character vector after combining 
## [1] "character"
v5 <- v1 + v2 # create a new vector with element-wise addition (must be same length)
head(v5) # since v2 was a vector of 1s, v5 should add 1 to each element in v1
## [1]  78   4  15 103  67
v5 <- v1 + 1 # You can also just add 1 (or any number) to the vector
v6 <- v2 * 2 # Or multiply the elements, in this case by 2
head(v6)
## [1] 2 2 2 2 2
sum(v2) # adds all the elements in the vector
## [1] 5
mean(v1) # takes the mean of the elements
## [1] 52.4
sd(v1) # standard deviation of the elements
## [1] 42.32375
cor(v1, v5) # correlation between vectors of same length
## [1] 1

Factors

Factors are a way for us to store data as categories. R will assign each distinct category an integer value, but that does not mean this is a vector of quantitative values. The integers are simply a way for R to organize the data into levels. For example, if I create a factor of UWB school affiliation among a hypothetical academic committee, you will see that each school is a level and has a corresponding value (1=BUS, 2=IAS, 3=NUR, 4=SES, 5=STEM).

UWBschools <- factor(c("IAS", "BUS", "IAS", "STEM", "BUS", "NUR", "NUR", "SES"))
UWBschools
## [1] IAS  BUS  IAS  STEM BUS  NUR  NUR  SES 
## Levels: BUS IAS NUR SES STEM
as.numeric(UWBschools)
## [1] 2 1 2 5 1 3 3 4

Ordered factors

Some factors have an order, such as levels of educational degree attainment:

degree = c("high school","associates","bachelors","masters","masters", "high school","high school","associates","bachelors","high school", "high school")
degree = factor(degree)
table(degree)
## degree
##  associates   bachelors high school     masters 
##           2           2           5           2

Note that R did not automatically order the categories. We can fix this:

degree = factor(degree,levels=c("high school","associates","bachelors","masters"), ordered = TRUE)
table(degree)
## degree
## high school  associates   bachelors     masters 
##           5           2           2           2

Again, even though R will store these values as integers, they are not quantitative variables. For instance, look what happens when I try to take the mean of the degree factor:

as.numeric(degree)
##  [1] 1 2 3 4 4 1 1 2 3 1 1
mean(degree)
## Warning in mean.default(degree): argument is not numeric or logical: returning
## NA
## [1] NA

Matrices

We have seen from lecture that matrices are a powerful tool to represent networks. Although much of our work will revolve around using edgelists (for computational efficiency), it is essential to be able to perform basic operations with matrices in R.

m1 <- matrix(data=1, nrow = 10, ncol = 5) # here is a 10 x 5 matrix of 1s
m1
##       [,1] [,2] [,3] [,4] [,5]
##  [1,]    1    1    1    1    1
##  [2,]    1    1    1    1    1
##  [3,]    1    1    1    1    1
##  [4,]    1    1    1    1    1
##  [5,]    1    1    1    1    1
##  [6,]    1    1    1    1    1
##  [7,]    1    1    1    1    1
##  [8,]    1    1    1    1    1
##  [9,]    1    1    1    1    1
## [10,]    1    1    1    1    1
dim(m1)
## [1] 10  5

We can also create a matrix by combining vectors:

m2 <- cbind(v1,v2,v5) # binds the previous vectors as columns to form a 9x3 matrix
m2
##       v1 v2  v5
## [1,]  77  1  78
## [2,]   3  1   4
## [3,]  14  1  15
## [4,] 102  1 103
## [5,]  66  1  67
dim(m2)
## [1] 5 3
m2v2 <- rbind(v1,v2,v5) # binds the vectors as rows to form a 3x9 matrix
m2v2
##    [,1] [,2] [,3] [,4] [,5]
## v1   77    3   14  102   66
## v2    1    1    1    1    1
## v5   78    4   15  103   67
dim(m2v2)
## [1] 3 5

Basic matrix operations

There are a plethora of matrix operations that we can perform, far too many to recreate here. A really important operation, though, is the transpose function and matrix multiplication. This is particularly helpful when we work with two-mode / bipartite networks.

t(m2) # transpose of m2
##    [,1] [,2] [,3] [,4] [,5]
## v1   77    3   14  102   66
## v2    1    1    1    1    1
## v5   78    4   15  103   67
m2t <- t(m2) # similar but now we have created a new data object
m2t
##    [,1] [,2] [,3] [,4] [,5]
## v1   77    3   14  102   66
## v2    1    1    1    1    1
## v5   78    4   15  103   67

Now we can post-multiply m2 * m2t to create a square 5x5 matrix:

m2 %*% m2t
##       [,1] [,2] [,3]  [,4]  [,5]
## [1,] 12014  544 2249 15889 10309
## [2,]   544   26  103   719   467
## [3,]  2249  103  422  2974  1930
## [4,] 15889  719 2974 21014 13634
## [5,] 10309  467 1930 13634  8846

We can also perform element-wise multiplication:

m2*m2
##         v1 v2    v5
## [1,]  5929  1  6084
## [2,]     9  1    16
## [3,]   196  1   225
## [4,] 10404  1 10609
## [5,]  4356  1  4489

Arrays

Arrays are a special type of matrix that involves more than two dimensions. This can be helpful when we explore networks defined by multiple relations. Here is an example of a 5x4x2 array:

a1 <- array(data=1:20,dim=c(5,4,2))
a1
## , , 1
## 
##      [,1] [,2] [,3] [,4]
## [1,]    1    6   11   16
## [2,]    2    7   12   17
## [3,]    3    8   13   18
## [4,]    4    9   14   19
## [5,]    5   10   15   20
## 
## , , 2
## 
##      [,1] [,2] [,3] [,4]
## [1,]    1    6   11   16
## [2,]    2    7   12   17
## [3,]    3    8   13   18
## [4,]    4    9   14   19
## [5,]    5   10   15   20

Lists

In R, lists are collections of many different kinds of objects that we have created above, such as vectors, matrices, logical values, character strings, and so on. Lists are created using the list() function.

list1 <- list("IAS", "STEM", "BUS","SES","NUR", c(12,1,34), TRUE, 23, 6)
list1
## [[1]]
## [1] "IAS"
## 
## [[2]]
## [1] "STEM"
## 
## [[3]]
## [1] "BUS"
## 
## [[4]]
## [1] "SES"
## 
## [[5]]
## [1] "NUR"
## 
## [[6]]
## [1] 12  1 34
## 
## [[7]]
## [1] TRUE
## 
## [[8]]
## [1] 23
## 
## [[9]]
## [1] 6

Here is another list that contains a character vector, a matrix, and a list:

list2 <-list(c("IAS", "STEM", "BUS","SES","NUR"), matrix(c(1:10), nrow = 5), list("Toyota Tacoma",40))
names(list2) <- c("UWB Schools","A 5x2 Matrix","Meaningless List") ## we can assign names to the different elements in our list
list2
## $`UWB Schools`
## [1] "IAS"  "STEM" "BUS"  "SES"  "NUR" 
## 
## $`A 5x2 Matrix`
##      [,1] [,2]
## [1,]    1    6
## [2,]    2    7
## [3,]    3    8
## [4,]    4    9
## [5,]    5   10
## 
## $`Meaningless List`
## $`Meaningless List`[[1]]
## [1] "Toyota Tacoma"
## 
## $`Meaningless List`[[2]]
## [1] 40
list2[1] ## we can also access specific elements from the list
## $`UWB Schools`
## [1] "IAS"  "STEM" "BUS"  "SES"  "NUR"
list2$`A 5x2 Matrix` ## accesses an element by name
##      [,1] [,2]
## [1,]    1    6
## [2,]    2    7
## [3,]    3    8
## [4,]    4    9
## [5,]    5   10
list2[[4]] <- "UWB" ## adds a 4th (new) element to the list
list2
## $`UWB Schools`
## [1] "IAS"  "STEM" "BUS"  "SES"  "NUR" 
## 
## $`A 5x2 Matrix`
##      [,1] [,2]
## [1,]    1    6
## [2,]    2    7
## [3,]    3    8
## [4,]    4    9
## [5,]    5   10
## 
## $`Meaningless List`
## $`Meaningless List`[[1]]
## [1] "Toyota Tacoma"
## 
## $`Meaningless List`[[2]]
## [1] 40
## 
## 
## [[4]]
## [1] "UWB"

Data Frames

It is difficult to overstate the importance of data frames in R when it comes to analysis, including network analysis. Data frames are a type of list that typically include columns of vectors or factors and rows that show how the data are distributed within them. Let’s create a basic data frame.

df1 <- data.frame( ID=1:5,
                   School=c("IAS","BUS","STEM","SES","NUR"),
                   certification=c(F,F,F,T,T),
                   size_percent=c(15,15,29,4,8))
head(df1)
##   ID School certification size_percent
## 1  1    IAS         FALSE           15
## 2  2    BUS         FALSE           15
## 3  3   STEM         FALSE           29
## 4  4    SES          TRUE            4
## 5  5    NUR          TRUE            8

Just as with any list, we can perform all kinds of operations, such as accessing specific elements. We can also retrieve information about our data. For example, suppose we want to retrieve all schools with size greater than 5 percent:

df1[df1$size_percent>5,2]
## [1] "IAS"  "BUS"  "STEM" "NUR"

And, of course, we can generate descriptive statistics where appropriate, such as the average size of schools that do not offer any professional certifications:

mean(df1[df1$certification==FALSE,4])
## [1] 19.66667

Basics of Drawing Networks in iGraph

## Load iGraph. You will need to do this every time you start an R session.
library(igraph)
## 
## Attaching package: 'igraph'
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## The following object is masked from 'package:base':
## 
##     union

Let’s create some graphs

## Note here that I have created a graph object, g1, and defined the undirected 
## edges with a vector involving three vertices.
g1 <- graph(edges = c(1,2, 2,3, 3,1), n=3, directed = F)
## Warning: `graph()` was deprecated in igraph 2.1.0.
## ℹ Please use `make_graph()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
plot(g1, vertex.color="grey")

# Now with 20 vertices. Note that igraph uses directed edges (aka "arcs") by default.
# Also, vertices can be part of a network even if they are disconnected (i.e., "isolates"):
g2 <- graph (edges=c(1,2, 1,3, 3,1, 4,5, 5,7, 6,7, 4,7, 10,11, 12,1, 13, 5, 14,14, 15, 13), n=20)
plot(g2, edge.arrow.size=.5, vertex.color="grey")

## You can also create graphs with names instead of node symbols.
g3 <- graph_from_literal(IAS-BUS:STEM-NUR, IAS-NUR-SES) # the ':' operator joins vertex pairs
plot(g3, vertex.shape="none", vertex.label.color="black")

## this can be directed as well
g4 <- graph_from_literal(IAS-+BUS:STEM+-NUR, IAS+-+NUR+-SES)
plot(g4, vertex.shape="none", vertex.label.color="black")

## let's combine the graphs using the disjoint union operator %du% 
plot(g1 %du% g3, vertex.size=10, vertex.label=NA, vertex.color="grey")
## Warning: Duplicate vertex names in disjoint union.

## we can "rewire" these graphs by randomly changing the connections
g1g3 <- g1 %du% g3
## Warning: Duplicate vertex names in disjoint union.
g1g3 <- rewire(g1g3, each_edge(prob = 0.5))
plot(g1g3, vertex.size=10, vertex.label=NA, vertex.color="grey")

Special Types of Graphs

## Empty graph (i.e., a graph with no edges). 
g0 <- make_empty_graph(50)
plot(g0, vertex.size=7, vertex.label=NA, vertex.color="grey")

## Complete graph (i.e., all possible edges between vertices). Use light gray 
## for edges to set them into the background.
gC <- make_full_graph(50)
plot(gC, edge.color="lightgray", vertex.size=7, vertex.label=NA, vertex.color="grey")

## Ring graph
gR <- make_ring(25)
plot(gR, vertex.size=8, vertex.label=NA, vertex.color="grey")

## Star graph
gStar <- make_star(50, mode = "undirected")
plot(gStar, vertex.size=7, vertex.label=NA, vertex.color="grey")

## Tree graph
gT <- make_tree(50, children = 4, mode="undirected")
plot(gT, vertex.size=7, vertex.label=NA, vertex.color="grey")

## Lattice graph
gL <- make_lattice(dimvector = c(5,5))
plot(gL, vertex.size=10, vertex.label=NA, vertex.color="grey")

Random Graphs

# Erdos-Renyi G(N,L)
gER <- sample_gnm(n=300,m=250)
plot(gER, vertex.size=3, vertex.label=NA, vertex.color="grey")

gER_deg <- degree_distribution(gER)
barplot(gER_deg, names.arg = c(0:6), xlab = "degree", ylim = c(0,0.4))

## Barabasi-Albert preferential attachment model for scale-free graphs
gBA <- sample_pa(n=300, power=1.0, m=1, directed = FALSE)
plot(gBA, vertex.size=3, vertex.label=NA, vertex.color="grey")

## Watts-Strogatz small-world model
gSW <- sample_smallworld(dim=2, size=20, nei=1, p=0.1)
plot(gSW, vertex.size=5, vertex.label=NA, vertex.color="grey", 
     layout=layout_in_circle)

Reading in Network Data Sets

Make sure to set your working drive. If working in an R script file, use the setwd() function. If working in markdown then use the following (but change the path):

    # knitr::opts_knit$set(root.dir = normalizePath("ADD YOU FILE PATH HERE")) 

Edgelists

This data set is used in a series of practice tutorials by SSRI at Duke University. The nodes are researchers and graduate students and the links are colleague ties (i.e., x knows y). The data were collected at a conference that the researchers were attending. Note that if you are working in an r script file, then you will need to set your working directory with the setwd() command. You will need to add the path location in parentheses in between quotes.

## Start by reading in the edge list
Colleague_Links <- read.csv('PCMI_Personally Know_Combined Edgelist.csv', header= T, as.is = T)
## Then read in the attributes of the nodes
Colleague_Nodes <- read.csv('PCMI_Know Personally_Combined_Nodelist.csv', header=T, as.is = T)
## Inspect the two data sets
head(Colleague_Links)
##   Source Target Weight
## 1    181     19      1
## 2    181    150      1
## 3    181    159      1
## 4    181    181      1
## 5    181    263      1
## 6    134      1      1
head(Colleague_Nodes)
##   Source     Status
## 1     19 Researcher
## 2    181 Researcher
## 3    150 Researcher
## 4    159 Researcher
## 5    263 Researcher
## 6      1 Researcher
nrow(Colleague_Nodes); length(unique(Colleague_Nodes$Source))
## [1] 158
## [1] 158
nrow(Colleague_Links); nrow(unique(Colleague_Links[,c("Source", "Target")]))
## [1] 1558
## [1] 1558
#Create a graph object to work with iGraph
Colleague_Graph <- graph_from_data_frame(d=Colleague_Links, vertices=Colleague_Nodes, directed=T)

# Inspect individual components of the graph
E(Colleague_Graph)
## + 1558/1558 edges from 533bea5 (vertex names):
##  [1] 181->19  181->150 181->159 181->181 181->263 134->1   134->25  134->72 
##  [9] 134->168 134->169 134->179 134->204 134->223 134->239 134->253 134->276
## [17] 134->283 134->286 9  ->154 9  ->263 9  ->264 9  ->272 9  ->277 9  ->283
## [25] 9  ->308 72 ->1   72 ->25  72 ->26  72 ->39  72 ->43  72 ->44  72 ->65 
## [33] 72 ->72  72 ->146 72 ->154 72 ->168 72 ->179 72 ->204 72 ->209 72 ->212
## [41] 72 ->226 72 ->228 72 ->232 72 ->252 72 ->253 72 ->264 72 ->268 72 ->272
## [49] 72 ->286 72 ->301 72 ->302 72 ->307 72 ->308 72 ->309 212->1   212->40 
## [57] 212->41  212->43  212->72  212->101 212->131 212->168 212->169 212->173
## [65] 212->179 212->190 212->205 212->212 212->223 212->228 212->239 212->254
## [73] 212->264 212->269 212->274 212->279 212->283 212->286 212->291 212->292
## + ... omitted several edges
V(Colleague_Graph)
## + 158/158 vertices, named, from 533bea5:
##   [1] 19  181 150 159 263 1   134 25  72  168 169 179 204 223 239 253 276 283
##  [19] 286 154 9   264 272 277 308 26  39  43  44  65  146 209 212 226 228 232
##  [37] 252 268 301 302 307 309 40  41  101 131 173 190 205 254 269 274 279 291
##  [55] 292 304 128 34  62  94  142 170 216 238 76  46  162 219 29  54  289 132
##  [73] 56  89  224 92  310 21  194 213 64  299 14  47  195 27  32  200 201 71 
##  [91] 172 210 66  248 57  161 61  86  7   113 137 36  189 90  112 91  12  116
## [109] 82  24  157 28  143 42  176 149 217 3   167 87  53  155 136 38  192 222
## [127] 290 305 306 229 175 163 114 227 234 235 287 221 280 295 267 257 140 117
## [145] 158 249 250 258 4   237 265 293 123 231 35  275 281 246
V(Colleague_Graph)$Status
##   [1] "Researcher"       "Researcher"       "Researcher"      
##   [4] "Researcher"       "Researcher"       "Researcher"      
##   [7] "Graduate Student" "Researcher"       "Researcher"      
##  [10] "Researcher"       "Researcher"       "Researcher"      
##  [13] "Researcher"       "Researcher"       "Researcher"      
##  [16] "Researcher"       "Researcher"       "Researcher"      
##  [19] "Researcher"       "Researcher"       "Graduate Student"
##  [22] "Researcher"       "Researcher"       "Researcher"      
##  [25] "Researcher"       "Researcher"       "Researcher"      
##  [28] "Researcher"       "Researcher"       "Researcher"      
##  [31] "Researcher"       "Researcher"       "Researcher"      
##  [34] "Researcher"       "Researcher"       "Researcher"      
##  [37] "Researcher"       "Researcher"       "Researcher"      
##  [40] "Researcher"       "Researcher"       "Researcher"      
##  [43] "Researcher"       "Researcher"       "Researcher"      
##  [46] "Researcher"       "Researcher"       "Researcher"      
##  [49] "Researcher"       "Researcher"       "Researcher"      
##  [52] "Researcher"       "Researcher"       "Researcher"      
##  [55] "Researcher"       "Researcher"       "Graduate Student"
##  [58] "Researcher"       "Researcher"       "Researcher"      
##  [61] "Researcher"       "Researcher"       "Researcher"      
##  [64] "Researcher"       "Graduate Student" "Researcher"      
##  [67] "Researcher"       "Graduate Student" "Researcher"      
##  [70] "Researcher"       "Researcher"       "Graduate Student"
##  [73] "Graduate Student" "Researcher"       "Researcher"      
##  [76] "Graduate Student" "Researcher"       "Graduate Student"
##  [79] "Graduate Student" "Graduate Student" "Researcher"      
##  [82] "Researcher"       "Graduate Student" "Researcher"      
##  [85] "Researcher"       "Graduate Student" "Graduate Student"
##  [88] "Graduate Student" "Graduate Student" "Graduate Student"
##  [91] "Graduate Student" "Graduate Student" "Graduate Student"
##  [94] "Researcher"       "Graduate Student" "Graduate Student"
##  [97] "Graduate Student" "Graduate Student" "Graduate Student"
## [100] "Graduate Student" "Graduate Student" "Graduate Student"
## [103] "Graduate Student" "Graduate Student" "Graduate Student"
## [106] "Graduate Student" "Graduate Student" "Graduate Student"
## [109] "Graduate Student" "Graduate Student" "Graduate Student"
## [112] "Graduate Student" "Graduate Student" "Graduate Student"
## [115] "Graduate Student" "Graduate Student" "Graduate Student"
## [118] "Graduate Student" "Researcher"       "Graduate Student"
## [121] "Graduate Student" "Graduate Student" "Graduate Student"
## [124] "Graduate Student" "Graduate Student" "Graduate Student"
## [127] "Graduate Student" "Graduate Student" "Graduate Student"
## [130] "Graduate Student" "Graduate Student" "Graduate Student"
## [133] "Graduate Student" "Graduate Student" "Graduate Student"
## [136] "Graduate Student" "Graduate Student" "Graduate Student"
## [139] "Graduate Student" "Graduate Student" "Graduate Student"
## [142] "Graduate Student" "Graduate Student" "Graduate Student"
## [145] "Graduate Student" "Graduate Student" "Graduate Student"
## [148] "Graduate Student" "Graduate Student" "Graduate Student"
## [151] "Graduate Student" "Graduate Student" "Graduate Student"
## [154] "Graduate Student" "Graduate Student" "Graduate Student"
## [157] "Graduate Student" "Graduate Student"

Let’s make a quick plot of the network:

plot(Colleague_Graph, edge.arrow.size=.4,vertex.label=NA, vertex.size=4)

If the goal was to make something hideous and incomprehensible, success! Let’s do a few things to clear this up.

#Remove self-loops (actors who nominate themselves)
Colleague_Graph2<-simplify(Colleague_Graph, remove.loops=TRUE)
plot(Colleague_Graph2, edge.arrow.size=.4,vertex.label=NA, vertex.size=4)

Removing the self-loops improved things a bit, but it would help to differentiate the researchers and grad students.

V(Colleague_Graph2)$color <- ifelse(Colleague_Nodes[V(Colleague_Graph2), 2] == "Researcher", "blue", "red") 
E(Colleague_Graph2)$color <- "grey" 
plot(Colleague_Graph2, edge.arrow.size=0.25, vertex.label = NA, vertex.size=4)

There are many more ways we can improve this visualization, but we’ll stop here for now. Later, we will see that can change the symbol shapes and sizes based on certain characteristics of nodes, such as their role and prominence in terms of network activity.

Matrix Format

Matrices are another common data format that we will read into iGraph for analysis and visualization. To begin, we will use a data set that features a network of teachers at a school (many thanks to Alan Daly for the use of these data). The nodes are teachers and the links are “seeks advice about teaching from.” The ties are thus directed.

Start by reading in the adjacency matrix and the node attributes.

teachers_links <- read.csv('TeachersAskAdvice.csv', header = T, row.names = 1)
teachers_nodes <- read.csv('TeachersAttributes.csv', header = T, as.is = T)
head(teachers_links)
##    AA AB AC AD AE AF AG AH AI AK AL AM AN AP AR AS AT AU AW AX AY AZ BA BB BC
## AA  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## AB  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0
## AC  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## AD  0  0  0  0  1  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
## AE  0  0  0  1  0  1  0  0  0  1  0  0  0  0  0  0  0  0  1  0  1  0  0  0  0
## AF  1  1  0  1  0  0  0  0  1  1  1  0  0  0  0  1  1  1  0  1  1  0  0  0  0
##    BD BE BF BG BI BK
## AA  0  0  0  0  0  0
## AB  0  0  0  0  0  0
## AC  0  0  0  0  0  0
## AD  0  0  0  0  0  0
## AE  0  1  0  0  0  0
## AF  0  0  0  0  1  0
head(teachers_nodes)
##   ID Gender Gradelevel Position YrsCurPos YrsCurSite OverallTrust
## 1 AA      1          1        1         8          6     3.000000
## 2 AB      1          1        1         1          1     2.166667
## 3 AC      2          1        1         2          2     2.833333
## 4 AD      2          2        1         2          2     3.000000
## 5 AE      1          2        1         3          3     3.000000
## 6 AF      1          2        2         1          1     2.666667
##   DistClimateLrn
## 1            2.8
## 2            2.2
## 3            2.4
## 4            2.8
## 5            2.6
## 6            2.0

Once we read in the matrices (links and node attributes), we once again create a graph object as we did above. This time, we will use the graph_from_adjacency_matrix() function since the data is in matrix form.

teachers_links <- as.matrix(teachers_links) # create a new data object that tells R to read the teachers_links file as a matrix
teachers <- graph_from_adjacency_matrix(teachers_links) # then create a graph object as an adjacency matrix
plot(teachers, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, edge.curved=.1, vertex.color="gray50") # create a basic plot using curved edges

# Network Visualization Igraph is a highly flexible package for visualizing networks, but the drawback is that the range of options can be overwhelming. In the following, I will demonstrate some of the visualization features of igraph. You can also use the ?igraph.plotting command for additional help.

Node Attributes

Often times it is helpful to color code or size the nodes based on some attribute (see Dr. Ying Wei’s R colors cheat-sheet). There are multiple ways to do this. In this example, I used a combination of 1. creating an igraph object that defines some attribute of the nodes, and 2. defining an attribute directly in the plot command. More specifically, I created an igraph object of vertex color based on gender. Then, in the plot, the vertex size is scaled based on teachers’ number of years in their current position. For a project in which you may want to utilize color schemes frequently or do more complex dataviz, creating an igraph object may be preferable. If you are just creating a basic plot, then specifying the attributes in the plot codes is probably sufficient. Note that if you create an igraph object for nodes or links, you can override those objects in the plot code.

V(teachers)$color <- ifelse(teachers_nodes[V(teachers), 2] == "1", "gray50", "gold") # create an igraph object for vertex color
plot(teachers, edge.arrow.size=.2, vertex.label=NA, vertex.size=teachers_nodes$YrsCurPos*2, main="Teacher Advice-Seeking") # plot the network and size the nodes using teachers' number of years in their current position 
colors <- c("gray50", "gold")
legend(x=-1.5, y=-1.1, c("Women","Men"), pch=21,
       col="#777777", pt.bg=colors, pt.cex=1.5, cex=.8, bty="n", ncol=1) 

Edge Attributes

It is also possible to color code the edges based on an attribute of the relation. In the plot below, the color of the edge is linked to the gender of the sending (i.e., seeks advice from) node. Using curved edges here is helpful for instances in which ties are reciprocated.

edge.start <- ends(teachers, es=E(teachers), names=F)[,1] 
edge.col <- V(teachers)$color[edge.start]
plot(teachers, edge.arrow.size=.2, edge.color=edge.col, edge.curved=.1, vertex.label=NA, vertex.size=teachers_nodes$YrsCurPos*2)

Highlighting Groups of Nodes

# Mark multiple groups:
plot(teachers, vertex.label=NA, edge.arrow.mode=0, mark.groups=list(c(1,4,5,8), c(15:17)), 
          mark.col=c("#C5E5E7","#ECD89A"), mark.border=NA)

Network Layouts

Deciding how to present the layout of a network can have a dramatic impact on how others interpret the results. Strictly speaking, only the sets of vertices and edges define a network, but there are certain principles that can be used to establish a layout that adds interpretive value to the network.

Force Directed Layouts

Force-directed layouts draw from a metaphor of nodes having electrical charges that repulse one another, while the edges act as springs that attract (also why these are sometimes called a spring embedded layouts). Thus, subsets of nodes that are highly interconnected will be positioned closer together than those that are less connected. The Fruchterman-Reingold layout is one of the most popular of the force-directed layouts in network analysis. This algorithm attempts to find the layout that minimizes the energy in the system.

In this first example, you can see that the same network looks slightly different each of the four times it was run since the Fruchterman-Reingold layout is non-deterministic. The nodes and edges are the same in each graph; it’s just that the placement of them varies.

par(mfrow=c(2,2), mar=c(0,0,0,0))
plot(teachers, layout=layout_with_fr, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")
plot(teachers, layout=layout_with_fr, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")
plot(teachers, layout=layout_with_fr, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")
plot(teachers, layout=layout_with_fr, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")

In some cases, however, you may want to be able to reproduce the network layout in the exact form. In this case, as with any random process, you use the set.seed() function. After setting the seed to an arbitrary starting point (2022), you can see that each graph layout is identical.

par(mfrow=c(2,2), mar=c(0,0,0,0))
set.seed(2022)
plot(teachers, layout=layout_with_fr, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")
set.seed(2022)
plot(teachers, layout=layout_with_fr, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")
set.seed(2022)
plot(teachers, layout=layout_with_fr, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")
set.seed(2022)
plot(teachers, layout=layout_with_fr, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")

It is also possible to condense or expand the network layout by creating an igraph object of the layout and using a multiplier.

layout_fr <- layout_with_fr(teachers)
par(mfrow=c(1,2), mar=c(0,0,0,0))
set.seed(2022)
plot(teachers, layout=layout_fr*.1, rescale=F, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")
set.seed(2022)
plot(teachers, layout=layout_fr*.2, rescale=F, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")

The Kamada Kuwai is a similar force directed layout.

set.seed(2022)
plot(teachers, layout=layout_with_kk, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")

Multidimensional Scaling

Multidimensional scaling is a data reduction tool that attempts to position objects (in this case nodes) in a (usually) two-dimensional space based on a measure of (dis)similarity, such as Euclidean distance or Jaccard. It has an intuitive interpretation in that the proxmities of the nodes can be interpreted directly. In addition, it is possible to interpret the underlying dimensionality of the proximities, although great care needs to be taken when doing so (usually by reference to some theoretical construct).

plot(teachers, layout=layout_with_mds, edge.arrow.size=.3, vertex.label=NA, vertex.size=8, vertex.color="gray50")

layouts <- grep("^layout_", ls("package:igraph"), value=TRUE)[-1]
# Remove layouts that do not apply to our graph.
layouts <- layouts[!grepl("bipartite|merge|norm|sugiyama|tree", layouts)]

par(mfrow=c(3,3), mar=c(1,1,1,1))
for (layout in layouts) {
  print(layout)
  l <- do.call(layout, list(teachers)) 
  plot(teachers, vertex.label=NA, edge.arrow.mode=0, layout=l, main=layout) }
## [1] "layout_as_star"
## [1] "layout_components"
## [1] "layout_in_circle"
## [1] "layout_nicely"
## [1] "layout_on_grid"
## [1] "layout_on_sphere"
## [1] "layout_randomly"
## [1] "layout_with_dh"
## [1] "layout_with_drl"

## [1] "layout_with_fr"
## [1] "layout_with_gem"
## [1] "layout_with_graphopt"
## [1] "layout_with_kk"
## [1] "layout_with_lgl"
## [1] "layout_with_mds"

Measures of Whole Networks

There are a variety of measures one can use to describe the structural characteristics of networks.

Components

The default mode for the component function is to return weak components.

components(teachers)
## $membership
## AA AB AC AD AE AF AG AH AI AK AL AM AN AP AR AS AT AU AW AX AY AZ BA BB BC BD 
##  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 
## BE BF BG BI BK 
##  1  1  1  1  1 
## 
## $csize
## [1] 31
## 
## $no
## [1] 1

Note that the output provides the total number of components, the size of each, and the component membership for each node.

Strong components can be generated by specifying the mode in the code.

components(teachers, mode = "strong")
## $membership
## AA AB AC AD AE AF AG AH AI AK AL AM AN AP AR AS AT AU AW AX AY AZ BA BB BC BD 
##  3  3  2  3  3  3  3  1  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3  3 
## BE BF BG BI BK 
##  3  3  3  3  3 
## 
## $csize
## [1]  1  1 29
## 
## $no
## [1] 3

Diameter

The diameter provides the length of the longest geodesic in a graph. This can be combined with the farthest_vertices function to identify the vertex names so that the path can be highlighted in the network plot.

diameter(teachers)
## [1] 9
farthest_vertices(teachers)
## $vertices
## + 2/31 vertices, named, from b5ab3c1:
## [1] AP AE
## 
## $distance
## [1] 9
## highlight the path of the diameter
teacher.diameter <- shortest_paths(teachers,
                             from = V(teachers)["AP"],
                             to = V(teachers)["AE"],
                             output = "both")
ecol <- rep("gray70", ecount(teachers)) 
ecol[unlist(teacher.diameter$epath)] <- "tomato"
ew <- rep(2, ecount(teachers)) 
ew[unlist(teacher.diameter$epath)] <- 4
vcol <- rep("gray40", vcount(teachers)) 
vcol[unlist(teacher.diameter$vpath)] <- "gold"
plot(teachers, vertex.color=vcol, edge.color=ecol, edge.width=ew, edge.arrow.mode=0,
     vertex.label.cex=.7)

Density

The simplest and most widely used measure of cohesion is network density, which expresses the ratio of observed ties to the total number of possible ties, n(n-1) for a directed graph.

edge_density(teachers, loops=F) 
## [1] 0.1311828

Thus, in the teachers’ advice network, 13.1% of all possible directed ties are observed.

Compactness

How “clumpy” is the network? Compactness is one measure that can answer this question. Igraph does not have a function for compactness, but it is possible to write a function yourself.

compactness <- function(teachers) {
        gra.geo <- distances(teachers) ## generate geodesic distances
        gra.rdist <- 1/gra.geo  ## reciprocal of geodesics
        diag(gra.rdist) <- NA   ## assign NA to diagonal
        gra.rdist[gra.rdist == Inf] <- 0 ## replace infinity with 0
          # Compactness = mean of reciprocal distances
        comp.igph <- mean(gra.rdist, na.rm=TRUE) 
        return(comp.igph)
        }
compactness(teachers)
## [1] 0.5600358
compactness(Colleague_Graph2)
## [1] 0.4804967

The lower the value of compactness, the less cohesive the network. Whether a network is compact or not is always a relative question, so there isn’t a direct answer with this single measure. If we look at the colleague graph from above, it can be seen that the teacher advice graph is more compact than the colleague graph (at least descriptively). However, we have to be careful about directly comparing networks of different size and edge types (e.g., small networks tend to be more dense).

Reciprocity

There are two global measures that are commonly used to characterize the extent of reciprocity in a network: arc and dyadic. The former characterizes the proportion of all outgoing arcs that are reciprocated, whereas the latter is the proportion of all adjacent dyads that are symmetric.

dyad_census(teachers) # mutuals assymetrics and nulls
## $mut
## [1] 26
## 
## $asym
## [1] 70
## 
## $null
## [1] 369
reciprocity(teachers, mode = c("default")) ## arc reciprocity
## [1] 0.4262295
reciprocity(teachers, mode = c("ratio")) ## dyadic reciprocity
## [1] 0.2708333

Transitivity

This is also called the global clustering coefficient, as it provides a measure of the tendency for nodes to cluster together in a network. Specifically, the measure describes the proportion of closed triples relative to the total number of triples in the graph.

transitivity(teachers, type="global") # note that here the network is treated as undirected
## [1] 0.3434783

It is also possible to conduct a census of the different triadic states in a network. In directed networks, there are 16 possible states made up of mutuals, assymetrics and nulls (up; down; cyclical; transitive):
003 A, B, C, empty triad: 0 mutuals, 0 asymmetrics, 3 nulls
012 A->B, C
102 A<->B, C
021D A<-B->C
021U A->B<-C
021C A->B->C
111D A<->B<-C
111U A<->B->C
030T A->B<-C, A->C
030C A<-B<-C, A->C
201 A<->B<->C
120D A<-B->C, A<->C
120U A->B<-C, A<->C
120C A->B->C, A<->C
210 A->B<->C, A<->C
300 A<->B<->C, A<->C, completely connected triad: 3 mutuals, 0 asymmetrics, 0 nulls

triads <- triad.census(teachers) # optional: create an object so the states can be plotted
## Warning: `triad.census()` was deprecated in igraph 2.0.0.
## ℹ Please use `triad_census()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
triads
##  [1] 2322 1153  488  106  100   70   76   82   24    2   19   21   13    5   11
## [16]    3
barplot(triads, names.arg = c("003", "012", "102", "021D", "021U", "021C",
                              "111D", "111U", "030T", "030C", "201", "120D",
                              "120U", "120C", "210", "300"), xlab = "Triadic State",
        cex.names = .6, cex.axis = .8, ylim = c(0,2500))

Centralization

How much of the network activity works through a central hub? Freeman’s degree centralization is a measure of the extent to which a network is dominated by a single node or relatively few set of nodes. Degree is used as the measure of centrality in this example, but other measures can be substituted depending on the research question (e.g., betweenness, closeness)

centr_degree(teachers)$centralization
## [1] 0.2088889

Centrality

Whereas measures of cohesion assess the overall structure of a network, measures of centrality focus on the roles and positions of individual nodes. Put another way, centrality measures seek to describe how individual nodes contribute to the overall structure of a network.There are numerous available measures, but the following are a common point of departure for such analyses.

Degree

The degree of a node is the number of lines incident to that node (i.e., the number of connections it has to other nodes). For directed networks, there can be in-degree (number of lines being “sent” to the focal node), out-degree (the number of lines originating from the focal node), and total degree (the sum of in-degree and out-degree).

degree(teachers, mode="in") # returns the degree of all nodes for the mode specified
## AA AB AC AD AE AF AG AH AI AK AL AM AN AP AR AS AT AU AW AX AY AZ BA BB BC BD 
##  6 12  0  5  5  8  3  0  7  4  3  7  3  4  2  2  1  7  2 11  7  2  3  1  1  3 
## BE BF BG BI BK 
##  3  2  3  3  2
degree(teachers, mode="in", normalized=TRUE) # normalized=TRUE divides raw degree by n-1 to express as a proportion
##         AA         AB         AC         AD         AE         AF         AG 
## 0.20000000 0.40000000 0.00000000 0.16666667 0.16666667 0.26666667 0.10000000 
##         AH         AI         AK         AL         AM         AN         AP 
## 0.00000000 0.23333333 0.13333333 0.10000000 0.23333333 0.10000000 0.13333333 
##         AR         AS         AT         AU         AW         AX         AY 
## 0.06666667 0.06666667 0.03333333 0.23333333 0.06666667 0.36666667 0.23333333 
##         AZ         BA         BB         BC         BD         BE         BF 
## 0.06666667 0.10000000 0.03333333 0.03333333 0.10000000 0.10000000 0.06666667 
##         BG         BI         BK 
## 0.10000000 0.10000000 0.06666667
# You can also create objects and then combine into a data frame:
teacher_degree <- degree(teachers, mode="total")
teacher_indeg <- degree(teachers, mode="in") 
teacher_outdeg <- degree(teachers, mode="out")
dat <- data.frame(teacher_degree, teacher_indeg, teacher_outdeg)
dat
##    teacher_degree teacher_indeg teacher_outdeg
## AA              7             6              1
## AB             13            12              1
## AC              1             0              1
## AD              8             5              3
## AE             11             5              6
## AF             20             8             12
## AG              6             3              3
## AH              3             0              3
## AI             13             7              6
## AK              7             4              3
## AL              9             3              6
## AM             12             7              5
## AN              5             3              2
## AP              5             4              1
## AR              5             2              3
## AS              3             2              1
## AT              5             1              4
## AU             10             7              3
## AW             16             2             14
## AX             15            11              4
## AY              9             7              2
## AZ             10             2              8
## BA              8             3              5
## BB              4             1              3
## BC              4             1              3
## BD              9             3              6
## BE              6             3              3
## BF              4             2              2
## BG              6             3              3
## BI              5             3              2
## BK              5             2              3

Once you have degree objects, then you can easily weight node sizes by those measures:

par(mfrow=c(1,3), mar=c(1,1,1,1))
set.seed(2022)
plot(teachers, edge.arrow.size=.2, vertex.label=NA, vertex.color="tomato",
     vertex.size=teacher_degree)
title(main="total degree", line= -5)
set.seed(2022)
plot(teachers, edge.arrow.size=.2, vertex.label=NA, vertex.color="tomato",
     vertex.size=teacher_indeg)
title(main="in-degree", line= -5)
set.seed(2022)
plot(teachers, edge.arrow.size=.2, vertex.label=NA, vertex.color="tomato",
     vertex.size=teacher_outdeg)
title(main="out-degree", line= -5)

Betweenness

Betweenness may be the most theoretically developed concept of centrality, at least in the social and behaviorial sciences. Betweenness captures how often a focal node falls along the shortest path between pairs of nodes in the network.

between <- betweenness(teachers, directed = TRUE, normalized = FALSE)
bet_normalized <- betweenness(teachers, directed = TRUE, normalized = TRUE)
bet.dat <- data.frame(between, bet_normalized)
bet.dat
##       between bet_normalized
## AA  45.333333    0.052107280
## AB 109.000000    0.125287356
## AC   0.000000    0.000000000
## AD  24.176190    0.027788725
## AE 101.742857    0.116945813
## AF 124.640476    0.143264915
## AG  27.500000    0.031609195
## AH   0.000000    0.000000000
## AI  80.173810    0.092153804
## AK  65.883333    0.075727969
## AL  78.683333    0.090440613
## AM  94.795238    0.108960044
## AN  37.226190    0.042788725
## AP  22.833333    0.026245211
## AR   0.000000    0.000000000
## AS   5.650000    0.006494253
## AT   3.452381    0.003968254
## AU  19.045238    0.021891078
## AW  92.009524    0.105758073
## AX 300.985714    0.345960591
## AY  29.726190    0.034168035
## AZ 103.195238    0.118615216
## BA 259.619048    0.298412698
## BB   5.009524    0.005758073
## BC   2.676190    0.003076081
## BD 148.900000    0.171149425
## BE   3.616667    0.004157088
## BF  49.566667    0.056973180
## BG   6.500000    0.007471264
## BI  57.892857    0.066543514
## BK  85.166667    0.097892720
plot(teachers, edge.arrow.size=.2, vertex.label=NA, vertex.color="tomato",
     vertex.size=between*.1, main="Nodes weighted by betweenness centrality")

K-Step Reach

K-step centrality is similar to degree, only it measures the number of nodes within K links of a given node (if K=1 then this is the same as degree). Although it is possible to derive K-step measures that respect direction, it is often the case that network analysts treat the data as undirected. There is no K-step function in igraph, but you can write your own function to do this. In this function (created by Diazaboro Shizuka and reproduced on RPubs), K is set to 2 (i.e., path length of 2) and the results are normalized to show the proportion of of nodes reachable by 2-steps (data treated as undirected).

reach2<-function(x){
  r=vector(length=vcount(x))
  for (i in 1:vcount(x)){
    n=neighborhood(x,2,nodes=i)
    ni=unlist(n)
    l=length(ni)
    r[i]=(l)/vcount(x)}
  r}

reach2(teachers)
##  [1] 0.8709677 1.0000000 0.4193548 0.6774194 0.7096774 0.9032258 0.5806452
##  [8] 0.6129032 0.9354839 0.8064516 0.7741935 0.8064516 0.5483871 0.8064516
## [15] 0.5483871 0.8387097 0.8064516 0.8387097 0.9677419 1.0000000 0.7096774
## [22] 0.9677419 0.7741935 0.5806452 0.8709677 0.8387097 0.6774194 0.3548387
## [29] 0.7096774 0.8709677 0.5161290
teacher_reach2 <- reach2(teachers)
set.seed(2022)
plot(teachers, edge.arrow.size=.2, vertex.label=NA, vertex.color="tomato",
     vertex.size=teacher_reach2*10)

## PageRank PageRank, developed by Google founders Larry Page and Sergey Brin, is an interesting measure of a node’s importance in networks. It’s hard to overstate how much this algorithm has impacted our lives (for better or worse is another discussion). At its core, PageRank is intuitive: a node’s importance is determined not just by how many connections it receives, but by the quality of those connections. A node gains higher PageRank when it receives connections from other nodes that themselves have high PageRank scores. This recursive relationship means that each node’s centrality score is continuously updated based on the changing scores of its connecting nodes. In the present context, a link from a highly respected teacher carries more weight than multiple links from teachers who are less often sought after for advice.

teacher_pagerank <- page_rank(teachers, directed = TRUE)
set.seed(2022)
plot(teachers, edge.arrow.size=.2, vertex.label=NA, vertex.color="tomato",
     vertex.size=teacher_pagerank$vector*100)

Community Detection

Fastgreedy

library(igraphdata) # Karate data is stored in this package
data(karate) # call the data set and store in environment
set.seed(5)
plot(karate, vertex.labels=NA, main="Zachary's Karate Club")
## This graph was created by an old(er) igraph version.
## ℹ Call `igraph::upgrade_graph()` on it to use with the current igraph version.
## For now we convert it on the fly...

zach <- make_graph("Zachary") # make a new graph object that excludes the default colors based on faction
fast <- cluster_fast_greedy(zach)
dendPlot(fast) # provides a dendrogram of the hierarchical community structure
## Warning: `dendPlot()` was deprecated in igraph 2.0.0.
## ℹ Please use `plot_dendrogram()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

set.seed(5)
plot(fast,zach,vertex.label.color="black", vertex.size=14, vertex.label.size=1)

modularity(fast) # value of modularity for the fast.greedy community detection
## [1] 0.3806706
length(fast) # number of communities
## [1] 3
membership(fast) # community membership for each node
##  [1] 1 3 3 3 1 1 1 3 2 3 1 1 3 3 2 2 1 3 2 1 2 3 2 2 2 2 2 2 2 2 2 2 2 2
sizes(fast) # number of nodes in each community
## Community sizes
##  1  2  3 
##  8 17  9

If you find the shading and edge colors distracting, just plot the network based on membership:

set.seed(5)
plot(zach,vertex.label.color="black", vertex.size=14, vertex.label.size=1,
     vertex.color=membership(fast))

And the community colors can be customized using the RColorBrewer package:

library(RColorBrewer)
colors <- brewer.pal(length(fast),'Dark2') # create a palette 
V(zach)$color=colors[membership(fast)] # assign vertices to a color based on membership
set.seed(5)
plot(zach, vertex.label=NA)

Edge betweenness

This is the classic community detection algorithm based on work by Girvan-Newman.

gv <- cluster_edge_betweenness(zach, modularity = TRUE)
dendPlot(gv, mode="hclust")

set.seed(5)
plot(gv,zach,vertex.label.color="black", vertex.size=14, vertex.label.size=1)

modularity(gv)
## [1] 0.4012985

Louvain

multi <- cluster_louvain(zach)
set.seed(5)
plot(multi,zach,vertex.label.color="black", vertex.size=14, vertex.label.size=1)

modularity(multi)
## [1] 0.4188034

Infomap

The infomap algorithm can be used when working with directed graphs (as can edge-betweenness):

info <- cluster_infomap(teachers)
set.seed(5)
plot(info, teachers, edge.arrow.size=.2, vertex.label=NA)

Two-Mode Networks

Igraph is primarily designed to handle one-mode networks (i.e. direct ties between a set of nodes). However, the package contains a sufficient number of functions to do meaningful analysis of two-mode data (i.e., a set of nodes and a set of “events”). The tnet package offers a more targeted set of functions for two-mode data (as well as weighted and longitudinal networks), but it is not meant to replace the visualization capabilities of igraph. There is a newer package called bipartite developed for ecological networks, but I have not used it yet.

There are a few ways to analyze two-mode networks: 1. proceed as normal without any transformation of the data; 2. convert the two-mode dataset into separate one-mode datasets (one for each mode); or 3. analyze the network as a bipartite graph. The following provides setup for any of these approaches using a two-mode network of non-profit organizations in education (mode 1) and their policy preferences for education reform (mode 2) at the time of data collection (2017).

pie <- read.csv('nationalpartners.csv', header=T, row.names = 1)
pie <- as.matrix(pie)
partners <- pie%*%t(pie) # post-multiply the original matrix by its transpose to get a organization-by-organization matrix with weighted ties showing the number of policy preferences they share in common
beliefs <- t(pie)%*%(pie) # pre-multiply the transpose by the original matrix to get a policy preference-by-policy preference matrix with weighted ties showing the number of organizations they share in common
pienet <- graph_from_biadjacency_matrix(pie) # graph object from original two-mode matrix
pienet.bp <- bipartite_projection(pienet) # bipartite projection (helpful for visualizations)
partnernet <- graph_from_adjacency_matrix(partners, mode="undirected", 
                                          diag = FALSE, weighted = TRUE) # graph object of org-by-org network
beliefnet <- graph_from_adjacency_matrix(beliefs, mode = "undirected",
                                         diag = FALSE, weighted = TRUE) # graph object of preference-by-preference network
id <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24) # create node ids
mode <- c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2) # create mode membership
pie_nodes <- data.frame(id, mode) # create node data frame to differentiate modes

Two-mode visualization

plot(pienet, vertex.label.cex=.6, vertex.label.color="black", vertex.size=8, 
     vertex.color=ifelse(pie_nodes[V(pienet), 2] == 1, "lightgray", "lightblue"),
     vertex.frame.color="NA", vertex.label.dist=2)

Another option is to drop the vertices entirely and only visualize the labels (color coded by mode):

plot(pienet, vertex.shape="none", vertex.label.cex=.6, vertex.label.font=2,
     vertex.label.color=ifelse(pie_nodes[V(pienet), 2] == 1, "deepskyblue4", "palegreen3"))

A third option is to use the bipartite layout in the plot function:

plot(pienet, vertex.label=NA, vertex.size=8, 
     vertex.color=ifelse(pie_nodes[V(pienet), 2] == 1, "lightgray", "lightblue"),
     layout=layout.bipartite)

Two-mode analysis

You can generate density and centrality values for the original matrix, but some care has to be taken in the interpretation. For example, the density is 0.21, but by definition within mode ties are not possible.

edge_density(pienet) # caution! within mode ties are not possible so the value of density as traditionally defined is misleading. 
## [1] 0.2101449
58 / 135 # a better approach is to count the number of edges and then divide by nxm
## [1] 0.4296296

Running the degree or betweenness centrality scores will provide meaningful values. However, if you want normalized values, the standard approach in igraph will not work.

degree(pienet)  
##                     Center_American_Progress 
##                                            8 
##      Center_for_Reinventing_Public_Education 
##                                            6 
##                        Data_Quality_Campaign 
##                                            2 
##                Education_Resource_Strategies 
##                                            7 
##       Foundation_for_Excellence_in_Education 
##                                            8 
## National_Alliance_for_Public_Charter_Schools 
##                                            8 
##          National_Council_on_Teacher_Quality 
##                                            3 
##                              Education_Trust 
##                                            8 
##                            Fordham_Institute 
##                                            8 
##                               Choice_General 
##                                            4 
##                             Choice..Charters 
##                                            4 
##                               Equity_General 
##                                            5 
##                               Equity_Funding 
##                                            3 
##                           Equity_Instruction 
##                                            4 
##                   Accountability_Performance 
##                                            4 
##                      Accountability_Teachers 
##                                            3 
##                    Accountability_TestScores 
##                                            6 
##                            Standards_Leaders 
##                                            2 
##                           Standards_Teachers 
##                                            7 
##                          Standards_Readiness 
##                                            6 
##                              Support_Funding 
##                                            1 
##                        Support_Interventions 
##                                            3 
##                             Support_External 
##                                            4 
##                             Support_Families 
##                                            2
betweenness(pienet, directed = FALSE)
##                     Center_American_Progress 
##                                   41.0408716 
##      Center_for_Reinventing_Public_Education 
##                                   17.9824412 
##                        Data_Quality_Campaign 
##                                    0.4863354 
##                Education_Resource_Strategies 
##                                   33.9289219 
##       Foundation_for_Excellence_in_Education 
##                                   23.4456398 
## National_Alliance_for_Public_Charter_Schools 
##                                   32.7636454 
##          National_Council_on_Teacher_Quality 
##                                    4.3212815 
##                              Education_Trust 
##                                   25.1478041 
##                            Fordham_Institute 
##                                   20.8830591 
##                               Choice_General 
##                                    5.2120983 
##                             Choice..Charters 
##                                    5.2120983 
##                               Equity_General 
##                                    9.6799286 
##                               Equity_Funding 
##                                    4.6286033 
##                           Equity_Instruction 
##                                   10.3837282 
##                   Accountability_Performance 
##                                    4.3118688 
##                      Accountability_Teachers 
##                                    4.7358676 
##                    Accountability_TestScores 
##                                   19.7677342 
##                            Standards_Leaders 
##                                    1.2679128 
##                           Standards_Teachers 
##                                   36.6324167 
##                          Standards_Readiness 
##                                   17.9491456 
##                              Support_Funding 
##                                    0.0000000 
##                        Support_Interventions 
##                                    2.4858373 
##                             Support_External 
##                                    7.3602684 
##                             Support_Families 
##                                    1.3724921

Single Mode Analysis

Another common approach is to analyze the modes as distinct one-mode networks. It is important to keep in mind that the edges are not direct ties, but rather affiliations with the other mode.

plot(partnernet, edge.width=E(partnernet)$weight, vertex.label.cex=.8, vertex.label.color="black", 
     vertex.size=10,vertex.label.dist=2) # utilize the edge weight attribute that igraph creates to weight the edges

Here is the one mode projection of the other mode:

plot(beliefnet, edge.width=E(partnernet)$weight, vertex.label.cex=.8, vertex.label.color="black",
     vertex.size=10,vertex.label.dist=2) 

Note that transforming the two-mode network into one-mode networks tends to create highly dense networks and inflated degree scores since nodes often share at least one event in common. One option is to set a threshold for a tie based on the number of events that are shared in common, or remove events that all or nearly all nodes share in common. The following are the values when everything is left as is.

edge_density(partnernet)
## [1] 0.9722222
edge_density(beliefnet)
## [1] 0.8380952
degree(partnernet)
##                     Center_American_Progress 
##                                            8 
##      Center_for_Reinventing_Public_Education 
##                                            8 
##                        Data_Quality_Campaign 
##                                            7 
##                Education_Resource_Strategies 
##                                            8 
##       Foundation_for_Excellence_in_Education 
##                                            8 
## National_Alliance_for_Public_Charter_Schools 
##                                            7 
##          National_Council_on_Teacher_Quality 
##                                            8 
##                              Education_Trust 
##                                            8 
##                            Fordham_Institute 
##                                            8
degree(beliefnet)
##             Choice_General           Choice..Charters 
##                         12                         12 
##             Equity_General             Equity_Funding 
##                         13                         12 
##         Equity_Instruction Accountability_Performance 
##                         13                         12 
##    Accountability_Teachers  Accountability_TestScores 
##                         11                         13 
##          Standards_Leaders         Standards_Teachers 
##                          9                         14 
##        Standards_Readiness            Support_Funding 
##                         14                          7 
##      Support_Interventions           Support_External 
##                         11                         12 
##           Support_Families 
##                         11

Bipartite projections

Using the bipartite projection format defined in the setup above gives easy access to the different modes when visualizing two-mode networks:

par(mfrow=c(1,2))
plot(pienet.bp$proj1, vertex.label.color="black", vertex.label.dist=1.5, vertex.label.cex=.8, vertex.size=8, edge.width=E(partnernet)$weight)
plot(pienet.bp$proj2, vertex.label.color="black", vertex.label.dist=1.5, vertex.label.cex=.8, vertex.size=8, edge.width=E(partnernet)$weight)